Internet Search Engine Freshness by Web Server Help

Authors

Vijay Gupta

Roy H. Campbell

Abstract

We study how to keep Internet search engines up to date with the changes occurring at web servers across the Internet. Currently, web search engines poll web servers on a per-URL basis to obtain update information. We advocate an approach in which web servers themselves track changes to their content files and propagate updates to search engines. We propose an algorithm that uses both the freshness and the popularity of data at the web servers to determine the discrepancy between a web site and a search engine. This algorithm batches the push of updates from the web server to the search engine. We prove that this algorithm is competitive with an optimal algorithm.
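
The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the general idea of server-side change tracking with batched pushes, not the authors' method. It assumes, hypothetically, that the discrepancy is the summed popularity weight of pages changed since the last push and that a push is triggered once this sum reaches a fixed threshold. The names `UpdateBatcher`, `threshold`, and `popularity` are invented for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class UpdateBatcher:
    """Hypothetical server-side change tracker (not the paper's algorithm)."""

    threshold: float              # assumed discrepancy level that triggers a push
    popularity: dict[str, float]  # assumed per-URL weight, e.g. share of hits
    stale: set[str] = field(default_factory=set)  # URLs changed since last push

    def discrepancy(self) -> float:
        # Assumed measure: total popularity of pages the search engine
        # has not yet been told about.
        return sum(self.popularity.get(url, 0.0) for url in self.stale)

    def record_change(self, url: str) -> list[str]:
        """Mark url as changed; return the batch to push, or [] to keep waiting."""
        self.stale.add(url)
        if self.discrepancy() < self.threshold:
            return []
        batch = sorted(self.stale)  # one push covers every pending change
        self.stale.clear()
        return batch
```

Under these assumptions, a change to a rarely visited page is held back, while changes to popular pages quickly push the accumulated batch over the threshold. This contrasts with per-URL polling, where the search engine has to ask about each page separately.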

Similar resources

Workload-Aware Web Crawling and Server Workload Detection

With the development of search engines, more and more web crawlers are used to gather web pages. The rising crawling traffic has raised the concern that crawlers may impact web sites. On the other hand, a more efficient crawling strategy is required for the coverage and freshness of the search engine index. In this paper, crawlers of several major search engines are analyzed using one six-months acc...

An Approach to Design Incremental Parallel Webcrawler

The World Wide Web (WWW) is a huge repository of interlinked hypertext documents known as web pages. Users access these hypertext documents via the Internet. Since its inception in 1990, the WWW has grown manyfold in size. It now contains more than 50 billion publicly accessible web documents distributed all over the world on thousands of web servers, and it is still growing at an exponential rate. It is very...

Reducing Network Traffic and Managing Volatile Web Contents Using Migrating Crawlers with Table of Variable Information

As the size of the web continues to grow, searching it for useful information has become increasingly difficult. It is also reported that a substantial share of current Internet traffic and bandwidth consumption is due to the web crawlers that retrieve pages for indexing by the different search engines. Moreover, due to the dynamic nature of the web, it becomes very difficult for a search engine to prov...

Improving the Information Retrieval in the World Wide Web

In this paper we present a visualization system suitable for installation on any Internet search engine or directory. The system is based on a new user interface that makes searching and navigating through the results more comfortable for the user. The interface consists of just one window, and all the Web pages selected by users are downloaded in the background, without disturbing the user int...

Use of Fuzzy C-Means Algorithm for Web Proxy Server Performance Improvement

Nowadays the web is loaded with a large number of user requests, which creates a lot of traffic. As requests increase, the resources on the World Wide Web are also growing to a large extent. In addition, the services and applications provided by the web grow in direct proportion to it. For this reason, web traffic is huge, and gaining access to these resources incurs user-pe...

Publication date: 2001
